## Multigrain Parallelization and Compiler/Architecture Co-design for 30 Years

Hironori Kasahara
Waseda University, Japan

Abstract. Multicores have attracted much attention as a way to improve performance and reduce power consumption of computing systems facing the end of Moore's Law. To obtain high performance and low power on multicores, co-design of hardware and software, especially parallelizing and power-reducing compilers, is very important. The OSCAR (Optimally Scheduled Advanced Multiprocessor) compiler and the OSCAR multiprocessor/multicore architecture have been researched since 1985. This talk covers:

- the OSCAR multigrain parallelization compiler, which hierarchically exploits coarse-grain task parallelism, loop parallelism, and statement-level parallelism;
- global data locality optimization over coarse-grain tasks for cache and local memory;
- automatic power reduction through frequency and voltage control, clock gating, and power gating;
- heterogeneous task scheduling with overlapping data transfers using DMA controllers;
- software coherence control by the OSCAR compiler;
- automatic local memory management with software-defined blocks and their replacement;
- performance and power consumption of real applications, including automobile engine control, cancer treatment, and scientific applications, on various multicore systems, such as those from Intel, ARM, IBM, Fujitsu, Renesas, and Tilera.
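
As a rough illustration of the coarse grain task parallelism named in the abstract, the sketch below schedules a small dependency graph of tasks onto identical cores. It is not the OSCAR compiler's algorithm: it uses a generic critical-path list scheduling heuristic, and the function name `list_schedule`, the task names, and the costs are all hypothetical, chosen only for this example.

```python
from typing import Dict, List, Tuple


def list_schedule(
    costs: Dict[str, int],
    deps: Dict[str, List[str]],
    num_cores: int,
) -> Tuple[Dict[str, Tuple[int, int, int]], int]:
    """Greedy list scheduling of a task graph onto identical cores.

    costs: execution time of each task (hypothetical units).
    deps:  for each task, the tasks that must finish before it starts.
    Returns ({task: (core, start, finish)}, makespan).
    """
    # Build successor lists so we can compute critical-path priorities.
    succs: Dict[str, List[str]] = {t: [] for t in costs}
    for task, preds in deps.items():
        for p in preds:
            succs[p].append(task)

    # Priority of a task = length of the longest path from it to an exit task.
    level: Dict[str, int] = {}

    def critical_path(t: str) -> int:
        if t not in level:
            level[t] = costs[t] + max(
                (critical_path(s) for s in succs[t]), default=0
            )
        return level[t]

    for t in costs:
        critical_path(t)

    core_free = [0] * num_cores  # time at which each core becomes idle
    placed: Dict[str, Tuple[int, int, int]] = {}
    remaining = set(costs)

    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [
            t for t in remaining
            if all(p in placed for p in deps.get(t, []))
        ]
        if not ready:
            raise ValueError("dependency cycle detected")
        # Highest critical-path priority first; ties broken by task name.
        task = sorted(ready, key=lambda t: (-level[t], t))[0]
        # The task cannot start before all of its predecessors have finished.
        data_ready = max((placed[p][2] for p in deps.get(task, [])), default=0)
        # Pick the core that allows the earliest start (lowest index on ties).
        start, core = min(
            (max(core_free[c], data_ready), c) for c in range(num_cores)
        )
        finish = start + costs[task]
        placed[task] = (core, start, finish)
        core_free[core] = finish
        remaining.remove(task)

    return placed, max(core_free)


# Hypothetical graph: A feeds B and C, which both feed D.
costs = {"A": 2, "B": 3, "C": 4, "D": 1}
deps = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
schedule, makespan = list_schedule(costs, deps, num_cores=2)
```

In this example the four tasks have a total cost of 10 time units but finish in 7 on two cores, because B and C run concurrently once A completes. A multigrain compiler would additionally look for loop and statement level parallelism inside each task, which this sketch does not model.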